infographic chart
InfoChartQA: A Benchmark for Multimodal Question Answering on Infographic Charts
Xie, Tianchi; Lin, Minzhi; Liu, Mengchen; Ye, Yilin; Chen, Changjian; Liu, Shixia
Understanding infographic charts with design-driven visual elements (e.g., pictograms, icons) requires both visual recognition and reasoning, posing challenges for multimodal large language models (MLLMs). However, existing visual question answering benchmarks fall short in evaluating these capabilities because they lack paired plain charts and visual-element-based questions. To bridge this gap, we introduce InfoChartQA, a benchmark for evaluating MLLMs on infographic chart understanding. It includes 5,642 pairs of infographic and plain charts; the two charts in each pair share the same underlying data but differ in visual presentation. We further design visual-element-based questions to capture the unique visual designs and communicative intent of infographic charts. Evaluation of 20 MLLMs reveals a substantial performance decline on infographic charts, particularly for visual-element-based questions related to metaphors. The paired infographic and plain charts enable fine-grained error analysis and ablation studies, which highlight new opportunities for advancing MLLMs in infographic chart understanding. We release InfoChartQA at https://github.com/CoolDawnAnt/InfoChartQA.
- North America > United States > New York (0.04)
- Europe > Poland (0.04)
- Europe > Middle East > Malta (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.92)
- Education (0.67)
- Information Technology > Services (0.46)
- Banking & Finance > Economy (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.84)
ChartGalaxy: A Dataset for Infographic Chart Understanding and Generation
Li, Zhen; Li, Duan; Guo, Yukai; Guo, Xinyuan; Li, Bowen; Xiao, Lanxi; Qiao, Shenyu; Chen, Jiashu; Wu, Zijian; Zhang, Hui; Shu, Xinhuan; Liu, Shixia
Infographic charts are a powerful medium for communicating abstract data by combining visual elements (e.g., charts, images) with textual information. However, their visual and structural richness poses challenges for large vision-language models (LVLMs), which are typically trained on plain charts. To bridge this gap, we introduce ChartGalaxy, a million-scale dataset designed to advance the understanding and generation of infographic charts. The dataset is constructed through an inductive process that identifies 75 chart types, 440 chart variations, and 68 layout templates from real infographic charts and uses them to create synthetic ones programmatically. We showcase the utility of this dataset through three applications: (1) improving infographic chart understanding via fine-tuning, (2) benchmarking code generation for infographic charts, and (3) enabling example-based infographic chart generation. By capturing the visual and structural complexity of real designs, ChartGalaxy provides a useful resource for enhancing multimodal reasoning and generation in LVLMs.
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Spain (0.04)
- Europe > Italy (0.04)
- Energy (0.67)
- Health & Medicine (0.67)
- Media (0.46)
- Information Technology (0.46)